Automatic learning of British Sign Language from signed TV broadcasts

Author

  • Patrick Buehler
Abstract

In this work, we will present several contributions towards automatic recognition of BSL signs from continuous signing video sequences. Specifically, we will address three main points: (i) automatic detection and tracking of the hands using a generative model of the image; (ii) automatic learning of signs from TV broadcasts using the supervisory information available from subtitles; and (iii) generalisation given sign examples from one signer to recognition of signs from different signers. Our source material consists of many hours of video with continuous signing and corresponding subtitles recorded from BBC digital television. This is very challenging material for a number of reasons, including self-occlusions of the signer, self-shadowing, blur due to the speed of motion, and in particular the changing background.

Knowledge of the hand position and hand shape is a prerequisite for automatic sign language recognition. We cast the problem of detecting and tracking the hands as inference in a generative model of the image, and propose a complete model which accounts for the positions and self-occlusions of the arms. Reasonable configurations are obtained by efficiently sampling from a pictorial structure proposal distribution. The results using our method exceed the state-of-the-art for the length and stability of continuous limb tracking.

Previous research in sign language recognition has typically required manual training data to be generated for each sign, e.g. a signer performing each sign in controlled conditions, which is a time-consuming and expensive procedure. We show that for a given signer, a large number of BSL signs can be learned automatically from TV broadcasts using the supervisory information available from subtitles broadcast simultaneously with the signing. We achieve this by modelling the problem as one of multiple instance learning. In this way we are able to extract the sign of interest from hours of signing footage, despite the very weak and "noisy" supervision from the subtitles.

Lastly, we show that automatic recognition of signs can be extended to multiple signers. Using automatically extracted examples from a single signer, we train discriminative classifiers and show that these can successfully classify and localise signs in new signers. This demonstrates that the descriptor we extract for each frame (i.e. hand position, hand shape, and hand orientation) generalises between different signers.

This thesis is submitted to the Department of Engineering Science, University of Oxford, in fulfilment of the requirements for the degree of Doctor of Philosophy. This thesis is entirely my own work, and except where otherwise stated, describes my own research. Patrick Buehler, Keble College. Copyright © 2010 Patrick Buehler. All rights reserved.


Similar resources

Employing signed TV broadcasts for automated learning of British Sign Language

We present several contributions towards automatic recognition of BSL signs from continuous signing video sequences: (i) automatic detection and tracking of the hands using a generative model of the image; (ii) automatic learning of signs from TV broadcasts of single signers, using only the supervisory information available from subtitles; (iii) discriminative signer-independent sign recognitio...

Automatic and Efficient Long Term Arm and Hand Tracking for Continuous Sign Language TV Broadcasts

We present a fully automatic arm and hand tracker that detects joint positions over continuous sign language video sequences of more than an hour in length. Our framework replicates the state-of-the-art long term tracker by Buehler et al. (IJCV 2011), but does not require the manual annotation and, after automatic initialisation, performs tracking in real-time. We cast the problem as a generic ...

Advancing human pose and gesture recognition

This thesis presents new methods in two closely related areas of computer vision: human pose estimation, and gesture recognition in videos. In human pose estimation, we show that random forests can be used to estimate human pose in monocular videos. To this end, we propose a co-segmentation algorithm for segmenting humans out of videos, and an evaluator that predicts whether the estimated poses...

Adapting the Assessing British Sign Language Development: Receptive Skills Test into American sign language.

Signed languages continue to be a key element of deaf education programs that incorporate a bilingual approach to teaching and learning. In order to monitor the success of bilingual deaf education programs, and in particular to monitor the progress of children acquiring signed language, it is essential to develop an assessment tool of signed language skills. Although researchers have developed ...

Large-scale Learning of Sign Language by Watching TV (Using Co-occurrences)

We present a framework that automatically and quickly learns a large number of signs from sign language-interpreted TV broadcasts by exploiting supervisory information available in the subtitles. Our contributions are: (i) we show that, somewhat counter-intuitively, mouth patterns are highly informative for distinguishing words in a language for the Deaf, and their co-occurrence with signing ca...


Publication date: 2010